Skip to content

Conversation

@stefaniereuter
Copy link

@stefaniereuter stefaniereuter commented Aug 26, 2025

Description

Opening this as a draft pull request, while finishing the refactoring on the grib comparison but to start discussing the design and features.

This draft shows the direction in which I would like to continue refactoring.

This tool allows to compare two FDB or two experiments in one FDB. It uses the standard fdb tool mechanisms, minimal-keys, CLI request, but also allows to directly specify mars keys, where entries should be ignored. it uses fdb list to generate a request then sorts the results and compares them, depending on scope, where the default is just a comparison of the mars keys, without the messages itself, possibility of comparing grib-header only and then grib-header plus data section. Default is a key by key comparison,
More explanation and examples can be found here:

https://confluence.ecmwf.int/display/~ecm4053/FDB+Comparison

Contributor Declaration

By opening this pull request, I affirm the following:

  • All authors agree to the Contributor License Agreement.
  • The code follows the project's coding standards.
  • I have performed self-review and added comments where needed.
  • I have added or updated tests to verify that my changes are effective and functional.
  • I have run all existing tests and confirmed they pass.

🌈🌦️📖🚧 Documentation 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/fdb/pull-requests/PR-165

🌈🌦️📖🚧 Documentation Z3FDB 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/z3fdb/pull-requests/PR-165

🌈🌦️📖🚧 Documentation FDB 🚧📖🌦️🌈
https://sites.ecmwf.int/docs/dev-section/fdb/pull-requests/PR-165

@stefaniereuter stefaniereuter requested a review from Ozaq August 26, 2025 06:36
@stefaniereuter
Copy link
Author

This was supposed to be a draft pull request....

@stefaniereuter stefaniereuter marked this pull request as draft August 26, 2025 06:43
@dsarmany dsarmany force-pushed the feature/FDB-327-fdb-compare branch from 684c81d to 722f460 Compare October 21, 2025 09:34
@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch from 722f460 to 38d2604 Compare January 13, 2026 11:08
@stefaniereuter stefaniereuter force-pushed the feature/FDB-327-fdb-compare branch from 7b4aaf0 to 232d0b2 Compare January 15, 2026 12:38
@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch from d6efd5d to 957b3c0 Compare January 16, 2026 12:51
@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch 3 times, most recently from f6064e4 to c814917 Compare January 27, 2026 14:49
@pgeier
Copy link

pgeier commented Jan 27, 2026

Pending tasks:

  • add license to source and header files
  • document functions
  • fix tests
  • add data check tests

@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch from c814917 to 2daf3e2 Compare January 27, 2026 14:54
@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch from bcef5ed to b643bc8 Compare January 28, 2026 14:54
@codecov-commenter
Copy link

codecov-commenter commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 54.10053% with 347 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.66%. Comparing base (8abf5d2) to head (1b031b4).

Files with missing lines Patch % Lines
src/fdb5/tools/compare/grib/CompareBitwise.cc 0.00% 105 Missing ⚠️
src/fdb5/tools/compare/grib/Compare.cc 51.72% 56 Missing ⚠️
src/fdb5/tools/compare/grib/CompareKeys.cc 48.00% 52 Missing ⚠️
src/fdb5/tools/compare/fdb-compare.cc 73.18% 37 Missing ⚠️
src/fdb5/tools/compare/grib/CompareHash.cc 0.00% 32 Missing ⚠️
src/fdb5/tools/compare/common/DataMap.cc 73.68% 20 Missing ⚠️
src/fdb5/tools/compare/common/Util.cc 0.00% 19 Missing ⚠️
src/fdb5/tools/compare/common/Util.h 0.00% 6 Missing ⚠️
src/fdb5/tools/compare/grib/Utils.cc 78.57% 6 Missing ⚠️
src/fdb5/tools/compare/common/ComparisonMap.cc 79.16% 5 Missing ⚠️
... and 2 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #165      +/-   ##
===========================================
- Coverage    73.30%   72.66%   -0.64%     
===========================================
  Files          363      377      +14     
  Lines        21956    22712     +756     
  Branches      2253     2383     +130     
===========================================
+ Hits         16094    16504     +410     
- Misses        5862     6208     +346     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@pgeier pgeier force-pushed the feature/FDB-327-fdb-compare branch from 3508d48 to 1b031b4 Compare January 29, 2026 09:49
@pgeier pgeier marked this pull request as ready for review January 29, 2026 10:02
@simondsmart simondsmart self-requested a review January 30, 2026 10:45
@@ -0,0 +1,7 @@
ecbuild_configure_file( mismatch_mars.sh.in mismatch_mars.sh @ONLY )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect to see an ecbuild command getting test data (probably for more than one test), and then the test depending on it.

It isn't great to add meaningful-sized binary files to the git repo...

Copy link

@pgeier pgeier Jan 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is following the same approach as the regression tests.
I can avoid duplicating them. Or do you explicitly want them to be uploaded on get.ecmwf.int

@@ -0,0 +1,7 @@
ecbuild_configure_file( mismatch_grib.sh.in mismatch_grib.sh @ONLY )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do the data.grib files differ between the different tests? It looks like you are grib setting them, to set upthe tests on the go. So we probably only need one grib file. You migt even be able to use one which is already referenced in a different test, which would avoid the extra data.


FDBCompare(int argc, char** argv) : FDBVisitTool(argc, argv, "class,expver") {
options_.push_back(new SimpleOption<std::string>("test-config", "Path to a FDB config"));
options_.push_back(new SimpleOption<std::string>("reference-config", "Path to a second FDB config"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it important that there is a test and a reference? Is the compare not symmetric? I would have thought that --config-one and --config-two would suffice?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparison is symmetric, it's just about naming. Don't have an opinion on that. It's just that the convention is used at various places in the code right now.

options_.push_back(new SimpleOption<std::string>("test-config", "Path to a FDB config"));
options_.push_back(new SimpleOption<std::string>("reference-config", "Path to a second FDB config"));
options_.push_back(new SimpleOption<std::string>(
"reference-request", "Mars key request for reference FDB entry (e.g reference experiment)"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a "mars key request" that is not a phrase that makes sense. At the least this needs an example.

Method parseMethod(const std::string& s);


struct Options {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange being in a file called Scope.

/// @param test Test location of the message
/// @param opts Additional options
/// @return Comparison result. As mars is not comparing data, only matched is expected to be set
Result compareGrib(const DataIndex& ref, const DataIndex& test, const Options& opts);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit that structurally, I would have written this with a comparator class (initialised with Options). And then have the different types of comparison be internal to that class (probably as member functions, but they could be in an anonymous namespace).

Roughly happy with this as is, but stylistically it is not really the same as the rest of the FDB/MARS code in that regard. I would err towards making it so.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly removed a factory. The comparison calls on mars and grib are explicitly called in order. I don't see a reason to wrap a layer of indirection.

}
else {
fdbtest = FDB(config(args));
fdbref = FDB(config(args));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this re-assignment results in default-initialising (and constructing) two FDBs. Depending on the current environment that may fail (throwing an exception) before you try and construct the FDBs with the supplied configs. That would be an erronious failure mechanism.

I think that the easiest solution here is to create a function returning a pair<> of FDBs (or perhaps better, a single use struct containing two FDBs) such that you can avoid this.

}

if (refReqString_) {
opts_.referenceRequest = parseKeyValues(*refReqString_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't clear to me what a reference/test request is. And I am hardly a new user to this ecosystem. This needs some elaboration/commentary.

}


// Return if only a comparison of Mars metadata messages was specified as Command Line option
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does only this line warrant a comment. There are bits that could do with explanatory code. Not convinced this is the most significant...

if (opts_.scope == Scope::All) {
std::cout << "****************** SUMMARY ********************" << std::endl;
std::cout << gribRes << std::endl;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect a compare tool to return success or failure depending on the output. This very much looks like it prints the output, but then returns zero. That makes it much harder to integrate into wider scripts.

Copy link
Contributor

@simondsmart simondsmart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few extra details...


for (const auto& [rk, rv] : r) {
auto search = l.find(rk);
if (search == r.end()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug here. search comes from l.find but is being compared against r.end. Iterators are not comparable between containers.

}

if ((testReqString_ && !refReqString_) || (!testReqString_ && refReqString_)) {
throw UserError("Options --reference-request and --tests-request must either both be set or both be omitted.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be --test-request not --tests-request


// EditionNumber will is always the 8th Byte in a Grib message
int editionnumberRef = (int)bufferRef[7];
int editionnumberTest = (int)bufferRef[7];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both editionnumberRef and editionnumberTest are read from bufferRef, and so are equal by construction. Should be using bufferTest.

Please add a test to check that this is working.


if (holdsVector(valRef)) {
if (holdsVector(valTest)) {
auto l1 = vectorLength(valRef);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that vectorLength returns bool. So this is only testing if they are both empty/both have data. Doesn't test lengths.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh. good catch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants